정보과학회논문지 (Journal of KIISE)
Korean Title |
문서 분석을 위한 무한 잠재 주제 모형 |
English Title |
Infinite Latent Topic Models for Document Analysis |
Author |
신봉기 (Bong-Kee Sin)
|
Citation |
Vol. 45, No. 7, pp. 701~707 (2018. 07) |
Korean Abstract |
Because the concept of a topic is highly abstract, it is very difficult to define the topic representation of a text. Topics can be distinguished at various levels depending on the context or needs of the problem, and this makes it hard to automate document analysis. This paper proposes the infinite latent Dirichlet topic model and the infinite latent Markov topic model as ways of extending the well-known Latent Dirichlet Allocation (LDA) model into infinite topic models. The first model uses the Dirichlet process to remove LDA's restriction to a fixed number of topics. The second model further adds Markov chain dynamics to capture the sequential evolution of topics within a text. Both proposed models can analyze the organization of a document at an appropriate level of topics, offering theoretical rigor together with structural flexibility. A series of experiments confirmed that, compared with conventional LDA and the same model based on variational inference, the proposed models produce more intuitive analyses and better exhibit local topic-stationarity.
|
English Abstract |
Since the concept of a topic is highly abstract, the topic characterization of a text is not clearly defined. Depending on the problem's context or needs, topics may be distinguished at various levels of detail, which makes it difficult to analyze documents automatically. This paper presents infinite-topic extensions of the well-known Latent Dirichlet Allocation (LDA) model: the infinite Latent Dirichlet Topic model and the infinite Latent Markov Topic model. The first model relaxes LDA's constraint of a fixed, known number of topics using the Dirichlet process. The second model extends this further with Markov dynamics that capture the sequential evolution of topics in a text. Both models are theoretically rigorous and structurally flexible, and can capture document organization at a desired level of topics. A set of experiments showed interesting results, with a more intuitive topic characterization and clearer local stationarity properties than related models using Gibbs sampling and variational inference.
|
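The key idea behind both models, relaxing the fixed topic count via a Dirichlet process, can be illustrated with the Chinese restaurant process, the standard sequential view of Dirichlet-process sampling: each word joins an existing topic in proportion to that topic's current size, or opens a new topic in proportion to the concentration parameter. A minimal sketch (the function name and parameters are illustrative, not from the paper):

```python
import random

def crp_assignments(num_words, alpha=1.0, seed=0):
    """Chinese restaurant process: sample topic assignments without
    fixing the number of topics in advance. A word joins existing
    topic k with probability proportional to counts[k], or opens a
    new topic with probability proportional to alpha."""
    rng = random.Random(seed)
    counts = []       # counts[k] = number of words assigned to topic k
    assignments = []  # topic index chosen for each word
    for _ in range(num_words):
        total = sum(counts) + alpha
        r = rng.uniform(0.0, total)
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if r < acc:          # join existing topic k
                counts[k] += 1
                assignments.append(k)
                break
        else:                    # r fell in the alpha mass: new topic
            counts.append(1)
            assignments.append(len(counts) - 1)
    return assignments, counts

assignments, counts = crp_assignments(100, alpha=2.0)
print(len(counts))  # number of topics is inferred, not fixed a priori
```

The number of occupied topics grows roughly as alpha * log(n), so the model allocates as many topics as the data supports rather than requiring the count up front, which is the constraint the paper removes from LDA.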
Keywords |
topic model
infinite latent topics
Dirichlet process
Markov chain
Gibbs sampling
|